GenCore version 6.2.1
Copyright (c) 1993 - 2008 Biocceleration Ltd. 



OM protein - protein search, using sw model 



June 24, 2008, 



:36:37 ; Search time 36 Seconds 
(without alignments) 
2493.618 Million cell updates/sec 



US-10-552-515-1
Perfect score: 4950
Sequence: 1 MRMAATAWAGLQGPPLPTLC SELSSHWTPFTVPKASQLQQ 933

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5
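The "sw model" with a gap-open and gap-extend penalty corresponds to local alignment with affine gaps. The following is a minimal sketch of that scoring recurrence (Gotoh's formulation of Smith-Waterman), not GenCore's implementation. Two things are assumptions made for illustration: the function `toy_score` stands in for the BLOSUM62 table, and a gap of length k is charged `gap_open + k * gap_ext`, a convention this report does not state.

```python
def smith_waterman_affine(query, target, score, gap_open=10.0, gap_ext=0.5):
    """Return the best local alignment score with affine gap penalties.

    Assumed gap convention: a gap of length k costs gap_open + k * gap_ext.
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (n + 1)  # best score ending in a gap that skips query residues
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # best score ending in a gap that skips target residues
        for j in range(1, n + 1):
            # Either open a new gap from H or extend the gap already in progress.
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(query[i - 1], target[j - 1])
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(a, b):
    """Hypothetical stand-in for BLOSUM62: +5 for identity, -4 otherwise."""
    return 5.0 if a == b else -4.0


if __name__ == "__main__":
    # Eight matches (40) minus one gap of length 2 (10 + 2 * 0.5 = 11).
    print(smith_waterman_affine("AAAATTTT", "AAAACCTTTT", toy_score))  # 29.0
```

With a real substitution table, aligning the query against itself would give the "Perfect score" line, since every residue then scores its own diagonal entry and no gaps are needed.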



Searched: 283416 seqs, 96216763 residues 

Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
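The reported throughput appears to follow from the figures in this header: one dynamic-programming cell per pair of query residue and database residue, divided by the search time. The sketch below checks that arithmetic. The interpretation of "cell updates" is an inference from the numbers, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_length, db_residues, seconds):
    """Cells in the alignment matrix (query x database) per second, in millions."""
    return query_length * db_residues / seconds / 1e6


if __name__ == "__main__":
    # 933 query residues, 96216763 database residues, 36 seconds (all from this report).
    rate = million_cell_updates_per_sec(933, 96_216_763, 36)
    print(f"{rate:.3f} Million cell updates/sec")  # 2493.618 Million cell updates/sec
```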



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
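Because Pred. No. is an expected count of chance hits, a small value marks a result that is unlikely to be a chance match. As a hedged illustration, the sketch below filters the nine results of this report by a cutoff. The Pred. No. values are copied from the RESULT blocks; the cutoff of 0.01 is a hypothetical choice and not a threshold stated by the report.

```python
# Pred. No. per result number, copied from the RESULT blocks of this report.
PRED_NO = {
    1: 9.2e-54, 2: 3.4e-16, 3: 8.6e-07,
    4: 0.13, 5: 1.7, 6: 0.61, 7: 0.86, 8: 0.83, 9: 7.4,
}


def hits_below(pred_no_by_result, cutoff):
    """Return the result numbers whose Pred. No. is below the cutoff, in order."""
    return sorted(r for r, p in pred_no_by_result.items() if p < cutoff)


if __name__ == "__main__":
    CUTOFF = 0.01  # hypothetical cutoff, chosen only for illustration
    print(hits_below(PRED_NO, CUTOFF))  # [1, 2, 3]
```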



        Query
Score   Match  Length  DB  ID      Description

734     14.8%    1049   2  T22762  hypothetical prote
288.5    5.8%     572   2  F96755  hypothetical prote
181.5    3.7%     946   2  S48255  probable membrane
117      2.4%     548   2  I48693  natural resistance
115.5    2.3%    3010   1  GNWVTC  genome polyprotein
110.5    2.2%     680   2  T35404  probable squalene-
110.5    2.2%     873   2  S46584  probable membrane
110      2.2%     792   2  T00487  probable potassium

RESULT 1
T22762 

hypothetical protein F56A8.1 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 09-Jul-2004 

C;Accession: T22762 
R;McMurray, A. 

submitted to the EMBL Data Library, December 1996 

A;Reference number: Z19612
A;Accession: T22762
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1049 <WIL>
A;Cross-references: UNIPROT:O45572; UNIPARC:UPI000007F44C; EMBL:Z83230; PIDN:CAB05741.1; GSPDB:GN00021; CESP:F56A8.1
A;Experimental source: clone F56A8
C;Genetics:
A;Gene: CESP:F56A8.1
A;Map position: 3

A;Introns: 86/3; 146/3; 208/3; 245/2; 295/2; 325/2; 397/3; 532/3; 582/2; 612/3; 654/1; 677/1; 707/3; 
734/3; 786/2; 812/2; 870/1; 902/3; 942/3; 1011/2 
C;Superfamily: Caenorhabditis elegans hypothetical protein F56A8.1

Query Match 14.8%; Score 734; DB 2; Length 1049; 

Best Local Similarity 27.0%; Pred. No. 9.2e-54; 

Matches 248; Conservative 137; Mismatches 309; Indels 224; Gaps 32; 
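The percentages in each result header appear to be derived from the other numbers printed with them: Query Match is the score as a fraction of the perfect score (4950), and Best Local Similarity is the number of matches as a fraction of all alignment columns. The sketch below reproduces both for Result 1. These relations are inferred from the values in this report, not taken from GenCore documentation, and the function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Score expressed as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Result 1: Score 734; Matches 248; Conservative 137; Mismatches 309; Indels 224.
    print(f"{query_match_percent(734, 4950):.1f}%")                      # 14.8%
    print(f"{best_local_similarity_percent(248, 137, 309, 224):.1f}%")   # 27.0%
```

The same two formulas also reproduce the header values of Result 2 (5.8% and 20.5%).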

Qy 96 AAACRAGSPAKP-RIA-DFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGL 153 

II 1111:1111! : I I I : : II I : : I I 

Db 3 AATTEVDYPYFPFRISIDFVLV HNAAESRS — KGKYREFFEKAVQKEGL 49 

Qy 154 CVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIP 213 

: I I I I I : I : I : I I I : : I I : : : I : I 

Db 50 IIRHQ — QSGQT — HFTLISTPFHRLTREAEMSQMCFPLKDCQVKP GLP 94 

Qy 214 NVLLEV VPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYG 269 

Db 95 SCCIPLSQIFVTDDTVRFINAPFQRKHGSLFLNYHDEKSFFTSSQRGYLTYQILTKIDIS 154 

Qy 270 HEKK NLLGIHQLLAEGVLSAAFPLH 294 

Db 155 KDLKGERLGESQDEPTDPSTSSITSDEQLRRKGLSWLLMSDVYEEAFVLHAPSKEEPYFK 214 

Qy 295 DGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLG 351

: I I I I : I : I : I I I I : I I I :: I I I I I :: I I I I I I 

Db 215 AMQNGSVKAYNEFISEIELDPRRSLSLNWER WYKFQPLNKIRDYFGEQIAYYFAWQG 271 

Qy 352 FYTGWLLPAAWGTLVFLVGCFLVFSDIP TQEL-CGSKDS FE 392 

: I I I : I : I I : I II I : : I I : : | 

Db 272 TFLTLLWPAVIFGLWFIYGFIDSISSAPLDWNHCKWNFIGQTENVACGMRNGVTLFFS 331 

Qy 393 MCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRW 452 

I 1:11 II I I : : I I : : I : : : | | | : : | : | : | 

Db 332 MVTQ WFMSS FDTKMNAFFAVFMSIWGSVFVQIWKRNNSVLSYQW 375 

Qy 453 DCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAWV 512 

: I : I 11:1 I : I I I I I I I : I : I I I I : : I I I 

Db 376 NSDDFHAIEP-DRPEFRGS— KVKEDPITGEDIWISPALARYIKMLASFVFVSFSMLWV 432 

Qy 513 MCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLT 572 

: I : : I : I : III: I : : I : : I I I : I I 

Db 433 ISLMLVTLLKIWMVYNFQCTKEYTFHCWLS— AAFLPSILNTLSAMGLGAIYSNLVSRLN 490 

Qy 573 RWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNE 629

Db 491 SWENHRTESEHNNSLIVKIFAFQMVNTYTSLFYVAFIRPESHGLQPN — GLFGLGTEFKD 548 

Qy 630 ECAAGGCLIELAQELLVIMVGKQVINNMQEVLIP KLKGWWQKFRLRSKKRKAGASAG 686 

Db 549 TCLDDTCSSLLALQLLTHTLIKPFPKFFKDVVLPYFVKL FRLRMYTSRTEARVE 602

Qy 687 ASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARK 746

III: I : I : I I I I I I I : : I : : I : I I : 

Db 603 I EDDDQ ANVLMFASLFPLAPLLALIIGFVDMRIDAHR 639 

Qy 747 FVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRG 806 

: I : I : II II II I : II : I I I : : I I : I I 

Db 640 LIWFNRKPIPIITNGIGIWLPILTFLQYCAVFTNAFIVAFTSGFC 684

Qy 807 FLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYSQTYWNLLAIRLAFVIVFEHVVFSVGR 866 

I : I I III I I II I I I I : : : I I : 

Db 685 STFLA DGAYC-TVQN RLIIVIVFQNLVFGLKY 715 

Qy 867 LLDLLVPDIPESVEIKVKREYY LAKQA-LAENEVLFGT 903 
II :: I I I I ::::::: I : I I I : I I : : :

Db 716 LLSSVIPSIPASIKLALRKKRYWAHIVEKGDVPHRTRIKKRTRIAKLAWIASNQKMIKS 775 

Qy 904 NGTKDEQPKGSELSSHWT 921 

I I : : I 1:1 
Db 776 NRKKEKSNK KHFT 788



RESULT 2 
F96755 

hypothetical protein F3N23.22 [imported] - Arabidopsis thaliana 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Jul-2004 

C;Accession: F96755 

R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.; Alonso, J.; Altaf,
H.; Araujo, R.; Bowman, C.L.; Brooks, S.Y.; Buehler, E.; Chan, A.; Chao, Q.; Chen, H.; Cheuk, R.F.;
Chin, C.W.; Chung, M.K.; Conn, L.; Conway, A.B.; Conway, A.R.; Creasy, T.H.; Dewar, K.; Dunn, P.; 
Etgu, P.; Feldblyum, T.V.; Feng, J.; Feng, B.; Fujii, C.Y.; Gill, J.E.; Goldsmith, A.D.; Haas, B.; 
Hansen, N.F.; Hughes, B.; Huizar, L. 
Nature 408, 815-820, 2000 

A;Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C.; Khan, S.; Khaykin, E.; Kim, C.J.; Koo, H.
L.; Kremenetskaia, I.; Kurtz, D.B.; Kwan, A.; Lam, B.; Langin-Hooper, S.; Lee, A.; Lee, J.M.; Lenz,
C.A.; Li, J.H.; Li, Y.; Lin, X.; Liu, S.X.; Liu, Z.A.; Luros, J.S.; Maiti, R.; Marziali, A.;
Militscher, J.; Miranda, M.; Nguyen, M.; Nierman, W.C.; Osborne, B.I.; Pai, G.; Peterson, J.; Pham,
P.K.; Rizzo, M.; Rooney, T.; Rowley, D.; Sakano, H.

A;Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.; Tallon, L.J.;
Tambunga, G.; Toriumi, M.J.; Town, C.D.; Utterback, T.; van Aken, S.; Vaysberg, M.; Vysotskaia, V.
S.; Walker, M.; Wu, D.; Yu, G.; Fraser, C.M.; Venter, J.C.; Davis, R.W.

A;Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis.
A;Reference number: A86141; MUID:21016719; PMID:11130712
A;Accession: F96755
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-572 <STO>
A;Cross-references: UNIPROT:Q9SSM5; UNIPARC:UPI00000A63BD; GB:AE005173; NID:g5903091; PIDN:AAD55649.1; GSPDB:GN00141
C;Genetics:
A;Gene: F3N23.22
A;Map position: 1

Query Match 5.8%; Score 288.5; DB 2; Length 572; 

Best Local Similarity 20.5%; Pred. No. 3.4e-16; 

Matches 169; Conservative 97; Mismatches 236; Indels 323; Gaps 29; 
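Each result header is a run of `name value;` pairs, which makes it straightforward to load into a dictionary for comparison across hits. The sketch below is one way to do that with a regular expression. The pattern and the function name `parse_stats` are assumptions built around the lines as they appear in this report, not part of any GenCore tool.

```python
import re

# A field name (letters, dots, spaces), whitespace, a number, an optional "%", then ";".
STAT_RE = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+([-+0-9.eE]+)%?;")


def parse_stats(lines):
    """Collect every 'name value;' pair from the header lines into a dict of floats."""
    stats = {}
    for line in lines:
        for name, value in STAT_RE.findall(line):
            stats[name.strip()] = float(value)
    return stats


if __name__ == "__main__":
    header = [
        "Query Match 5.8%; Score 288.5; DB 2; Length 572;",
        "Best Local Similarity 20.5%; Pred. No. 3.4e-16;",
        "Matches 169; Conservative 97; Mismatches 236; Indels 323; Gaps 29;",
    ]
    stats = parse_stats(header)
    print(stats["Score"], stats["Pred. No."], stats["Gaps"])  # 288.5 3.4e-16 29.0
```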

Qy 142 ETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASN 201 

I : I I I : I I : I : | | : : | : | | | : 

Db 32 EVLVTELRKKGMWDR WGLAHEFLKVAAPSEILGNAAAE 71 



Qy 202 WSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFE 261

Db 72 LHIRKPTRLGI DLPFEMQGSEAFIRQPDGLLFS WFERFRCYQHLIY 117

Qy 262 ILAKTPYGHEKKNLLG IHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPR 309

Db 118 GIVNSG-GHDVTLKLDGREFCWTAGESLLRRLESEGVIKQMFPLHDE 163

Qy 310 LNQRQVLFQHWA-RWGKWN-KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLV 367

: I : I I : I I I II I I : I : I I I I
Db 164 -LKRKELLQNWALNW -- WNCTNQPIDQIYSYFGAK 195

Qy 368 FLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFS 427

I : I

Db 196 ELIKNLGN 203
Qy 428 LFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPY 487

I I I I II 

Db 204 ERAKEKEAYQRYEW 217 

Qy 488 FPERSRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASL 547 

I III :: I :::::: I : | | | : | : | | 

Db 218 FAYRKRFRN DVLVIMSIICLQLPFELAYAHIFEIITSDIIKYVLTA 263 

Qy 548 TGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIA 607 

Db 264 lYLLIIQYLTRLGGKVSVKLINREINESVEYRANSLIYKVF GLYFMQTYIG 314 

Qy 608 FFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKG 667

I III II : I I I : : : I I : : : | | | 

Db 315 IF YHVLLH-RN FMTLRQVLIQRLIISQVFWTLMDGSLPYLKY 355 

Qy 668 WWQKFRLRSKKR-KAGASAGASQ -- GPWEDDY ELVPCEGLFDEYLEMVL 713

Db 356 SYRKYRARTKKKMEDGSSTGKIQIASRVEKEYFKPTYSASIGVELE — DGLFDDSLELAL 413 

Qy 714 QFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLT 773 

III::! I I I I I : : I : I I I : I I : III: I III:] I 
Db 414 QFGMIMMFACAFPLAFALAAVSNVMEIRTNALKLLVTLRRPLPRAAATIGAWLNIWQFLV 473 

Qy 774 HLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFR 833 

: : : : I : I I II 
Db 474 VMSICTNSALL VCLY 488 

Qy 834 DDDGHYSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEI-KVK REY 887 

1:1: : I I : : : I I I : : I I I I : I I : I I : : 

Db 489 DQEGKWK lEPGLAAILIMEHVLLLLKFGLSRLVPEEPAWVRASRVKNVTQAQDM 542 

Qy 888 YLAKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKASQLQ 932 

I I I I : I : I I : I I 

Db 543 Y-CKQLL RSISGEFNSLTKPEQEQQQ 567 



RESULT 3 
S48255 

probable membrane protein YBR086c - yeast ( Saccharomyces cerevisiae) 
N;Alternate names: hypothetical protein YBR0809 

C; Species: Saccharomyces cerevisiae 

C;Date: 03-Aug-1995 #sequence_revision 11-Aug-1995 #text_change 09-Jul-2004

C;Accession: S48255; S45954; S44670 

R;Mannhaupt, G.; Stucka, R.; Ehnle, S.; Vetter, I.; Feldmann, H. 
Yeast 10, 1363-1381, 1994 

A;Title: Analysis of a 70 kb region on the right arm of yeast chromosome II.
A;Reference number: S48255; MUID:95208357; PMID:7900426
A;Accession: S48255
A;Status: nucleic acid sequence not shown
A;Molecule type: DNA
A;Residues: 1-946 <MAN>
A;Cross-references: UNIPROT:P38250; UNIPARC:UPI0000036C25; EMBL:X78993; NID:g476045; PIDN:CAA55593.1; PID:g476046

R;Feldmann, H.; Mannhaupt, G.; Schwarzlose, C.; Vetter, I. 
submitted to the Protein Sequence Database, August 1994 
A;Reference number: S45927 
A;Accession: S45954 
A;Molecule type: DNA
A;Residues: 1-946 <FE2>
A;Cross-references: UNIPARC:UPI0000036C25; EMBL:Z35955; NID:g536351; PID:g536352; MIPS:YBR086c
C;Genetics:
A;Gene: SGD:IST2
A;Cross-references: SGD:S0000290
A;Map position: 2R
C;Superfamily: Saccharomyces cerevisiae probable membrane protein YBR086c
C;Keywords: transmembrane protein
F;131-147/Domain: transmembrane #status predicted <TM1>
F;158-174/Domain: transmembrane #status predicted <TM2>
F;207-243/Domain: transmembrane #status predicted <TM3>
F;248-274/Domain: transmembrane #status predicted <TM4>
F;302-324/Domain: transmembrane #status predicted <TM5>
F;450-477/Domain: transmembrane #status predicted <TM6>
F;506-532/Domain: transmembrane #status predicted <TM7>
F;563-588/Domain: transmembrane #status predicted <TM8>
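The `F;` feature lines carry a residue range, a feature kind, a description, and optionally a status and a tag in angle brackets. As a sketch of how such lines could be read programmatically, the code below parses them into tuples and computes the segment length. The pattern is written against the lines as they appear in this report and tolerates stray spaces after `F;`; it is an assumption, not a reference parser for the PIR format.

```python
import re

FEATURE_RE = re.compile(
    r"^F;\s*(\d+)-(\d+)/(\w+)\s*:\s*(.*?)"   # range, kind, description
    r"(?:\s+#status\s+(\w+))?"               # optional status
    r"(?:\s+<(\w+)>)?$"                      # optional tag
)


def parse_feature(line):
    """Return (start, end, kind, description, status, tag), or None if no match."""
    m = FEATURE_RE.match(line.strip())
    if m is None:
        return None
    start, end, kind, description, status, tag = m.groups()
    return int(start), int(end), kind, description, status, tag


def feature_length(feature):
    """Number of residues covered by an inclusive start-end range."""
    start, end = feature[0], feature[1]
    return end - start + 1


if __name__ == "__main__":
    tm1 = parse_feature("F; 131-147/Domain: transmembrane #status predicted <TM1>")
    print(tm1)                  # (131, 147, 'Domain', 'transmembrane', 'predicted', 'TM1')
    print(feature_length(tm1))  # 17
```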

Query Match 3.7%; Score 181.5; DB 2; Length 946;

Best Local Similarity 18.6%; Pred. No. 8.6e-07; 

Matches 118; Conservative 99; Mismatches 243; Indels 173; Gaps 25; 
Qy 342 KVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCP 401

Db 119 KQSLYFAFLQNYIKWLIPFSFFGLSIRFLSNF 150 

Qy 402 FWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRK SATLAYRWDCSD 456 

: : I : I I I I I : : I I II: I 

Db 151 TYEFNST — YSLFAILWTLSFTAFWLYKYEPFWSDRLSKYSSFST 193 

Qy 457 YEDTEERPRPQFAASAPMTAPN PITGEDEPYFPERSRARRMLAGSWIWMVAWV 512 

I : : : : I I I : : I : : I I : : : : : : 
Db 194 lEFLQDKQKAQKKASSVIMLKKCCFIPVA LLFGA ILLSFQL 234 

Qy 513 MCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVS-LAHVL 571 

I 11:1 : I : II : : : I : I : I I : 

Db 235 YCFALEIFYKQIY NGPMI SILSFLPTILICTFTPVLTVIYNKYFVEPM 282 

Qy 572 TRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEEC 631 

I : I I I : : : I I : I : : I I : I I : I : : I : 

Db 283 TKWENHSSWNAKKSKEAKNFVIIFLSSY-VPLLITL FLYLPMGHLLTAEIRTKVF 337 

Qy 632 AAGGCLIEL AQELLVIMVGKQVINNMQEVLIPKLKGWWQK 671

III : : I I : I I : I I I : 

Db 338 NAFSILARLPTHDSDFIIDTKRYEDQFFYFIVINQLIQFSMENFVPSLVSIAQQKINGPN 397 

Qy 672 FRLRSKKRKAGASAGASQGPWE -- DDYELVPCEGLFD EYLEMVLQFGFVTIFVA 723

Db 398 PNFVKAESEIGKAQLSS-SDMKIWSKVKSYQTDPWGATFDLDANFKKLLLQFGYLVMFST 456 

Qy 724 ACPLAPLFALLNNWVEIRLDARKFVC EY-RRPVAERAQD IGIWFHILA 770 

Db 457 IWPLAPFICLIVNLIVYQVDLRKAVLYSKPEYFPFPIYDKPSSVSNTQKLTVGLWNSVLV 516 

Qy 771 GLTHL-AVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRY 829 

: I I I : : I : I : : I I 

Db 517 MFSILGCVITATLTYMYQSCNIP GVGAHTSIHTNKAWY 554 

Qy 830 RAFRDDDGHYSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYL 889 

I : : : I : : : : | | | : : | : : : | : : : : 

Db 555 LA NPINHSWINI VLYAVFIEHVSVAIFFLFSSILKSSHDDVANGIVPKHW 605 

Qy 890 AKQALAENEVL FGTNGTKD-EQPKGS 914

I : I I I : I I : I I I I 

Db 606 NVQNPPKQEVFEKIPSPEFNSNNEKELVQRKGS 638
RESULT 4
I48693

natural resistance-associated macrophage protein 1 - mouse 
N;Alternate names: transport system membrane protein Nramp 
C; Species: Mus musculus (house mouse) 

C;Date: 02-Jul-1996 #sequence_revision 02-Jul-1996 #text_change 09-Jul-2004 
C;Accession: I48693; A57071; A40739

R;Barton, C.H.; White, J.K.; Roach, T.I.; Blackwell, J.M. 
J. Exp. Med. 179, 1683-1687, 1994 

A;Title: NH2-terminal sequence of macrophage-expressed natural resistance-associated macrophage
protein (Nramp) encodes a proline/serine-rich putative Src homology 3-binding domain.
A;Reference number: I48693; MUID:94216838; PMID:7513015
A;Accession: I48693
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-548 <RES>
A;Cross-references: UNIPROT:P41251; UNIPARC:UPI000002770F; EMBL:X75355; NID:g505155; PIDN:CAA53102.1; PID:g505156

R;Govoni, G.; Vidal, S.; Cellier, M.; Lepage, P.; Malo, D.; Gros, P.
Genomics 27, 9-19, 1995
A;Title: Genomic structure, promoter sequence, and induction of expression of the mouse Nramp1 gene
in macrophages.
A;Reference number: A57071; MUID:95394476; PMID:7665187
A;Accession: A57071
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: DNA
A;Residues: 1-548 <GOV>
A;Cross-references: UNIPARC:UPI000002770F
R;Vidal, S.M.; Malo, D.; Vogan, K.; Skamene, E.; Gros, P.
Cell 73, 469-485, 1993
A;Title: Natural resistance to infection with intracellular parasites: isolation of a candidate for
Bcg.
A;Reference number: A40739; MUID:93258812; PMID:8490962
A;Accession: A40739
A;Status: preliminary
A;Molecule type: nucleic acid
A;Residues: 65-548 <VID>
A;Cross-references: UNIPARC:UPI0000178BFA
A;Note: sequence extracted from NCBI backbone (NCBIN: 131666, NCBIP: 131667)
C;Superfamily: natural resistance-associated macrophage protein 1

Query Match 2.4%; Score 117; DB 2; Length 548; 

Best Local Similarity 20.2%; Pred. No. 0.13; 

Matches 136; Conservative 89; Mismatches 207; Indels 240; Gaps 34; 



Qy 67 VLIDVSPPEAEKRGSYGSTAHASEPG-GQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQ 125

Db 1 MISDKSPPRL-SRPSYGSI- -SSLPGPAPQPAPCR- -ETYLSEKIP 41

Qy 126 QDSAARDRTDMHRTWRET-- FLD -- NLRAAGLCVDQQDVQDGNTTVHYALLSA 174

: I : I I I

Db 42 IPSADQGTFSLRKLWAFTGPGFLMSIAFLDPGNI- -ESDLQAGAVAGFKLLWVL 93

Qy 175 SWA VLC YYAEDLRLKL PLQELPNQ 198

II : I I I I : I : I : | | :

Db 94 LWATVLGLLCQRLAARLGWTGKDLGEVCHLYYPKVPRILLWLTIELAIVGSDMQEVIGT 153

Qy 199 ASNW SAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKR 255

Db 154 AISFNLLSAGRIPLWG -- GVLITIV-DTFFFLFLDNYGLRKLEAFFG 197

Qy 256 HQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQV 315
: I I : I I : I : I : I : : I I I I Mil: : 
Db 198 -- LLITIMALT-FGYE YWAHP -- SQGALLKGLVLPTCPGCGQPELLQAVGIVGAII 249

Qy 316 LFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFL- 374 

: : I : : I I I : : I I I : I : : : I : : I : 

Db 250 MPHNIYLHSALVKSREVDRTRRVDVREANMYF LIEATIALSVSFIINLFVM 300 

Qy 375 -VFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFD HGG 422

Db 301 AVFGQAFYQQT — NEEAFNIC ANSSLQNYAKIFPRDNNTVSVDIYQGG 346 

Qy 423 TVFFSLF MALWAVLLLEYWKRKSATLAY RWDCSDYEDTEERPRP 466 

: II : : I I I I I : : I I II 
Db 347 VILGCLFGPAALYIWAVGLLAAGQSSTMTGTYAGQFVMEGFLKLRW 392 

Qy 467 QFAASAPMTAPNPITGEDEPYFPERSR-ARRMLAGSWIV -- VMVAV V 511

Db 393 SRFARVLLTRSCAILPTVLVAVFRDLKDLSGLNDL 427 

Qy 512 VMCLVSIILYRAIMAIWSRSGNTLLAAWAS-RIASLTGS WNLVFILILSKI 563 

Db 428 LNVLQSLLLPFAVLPILTFTSMPAVMQEFANGRMSKAITSCIMALVCAINLYFVI SY 484 

Qy 564 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAF FKGRFVGYPG 618

III I : I : I : : | : | : I : : 

Db 485 LPSLPH PAYFGLVALFA-IGYLGLTAYLAWTCCIAHGATFLTHSS 528 

Qy 619 NYHTLFGVRNEE 630 

: I 1:1: III 
Db 529 HKHFLYGLPNEE 540 



RESULT 5 
GNWVTC 

genome polyprotein - hepatitis C virus 

N;Contains: capsid protein C; envelope protein M; hepacivirin (EC 3.4.21.98) (nonstructural protein
NS3); major envelope protein E; nonstructural protein NS1; nonstructural protein NS2; nonstructural
protein NS4a; nonstructural protein NS4b; nonstructural protein NS5
C; Species: hepatitis C virus 

C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 09-Jul-2004 
C;Accession: A38465 

R;Takamizawa, A.; Mori, C.; Fuke, I.; Manabe, S.; Murakami, S.; Fujita, J.; Onishi, E.; Andoh, T.;
Yoshida, I.; Okayama, H.

J. Virol. 65, 1105-1113, 1991 

A;Title: Structure and organization of the hepatitis C virus genome isolated from human carriers.
A;Reference number: A38465; MUID:91140698; PMID:1847440
A;Accession: A38465
A;Molecule type: genomic RNA
A;Residues: 1-3010 <TAK>
A;Cross-references: UNIPROT:P26663; UNIPARC:UPI0000131E1C; EMBL:M58335; NID:g329770; PIDN:AAA72945.1; PID:g329771
C;Superfamily: hepatitis C virus genome polyprotein
C;Keywords: ATP; capsid protein; envelope protein; glycoprotein; hydrolase; nonstructural protein;
nucleotide binding; P-loop; polyprotein; serine proteinase; transmembrane protein

F;2-115/Product: capsid protein C #status predicted <CPC>
F;116-191/Product: envelope protein M #status predicted <EPM>
F;192-389/Product: major envelope protein E #status predicted <MEE>
F;390-729/Product: nonstructural protein NS1 #status predicted <NS1>
F;730-1006/Product: nonstructural protein NS2 #status predicted <NS2>
F;1007-1615/Product: hepacivirin #status predicted <NS3>
F;1230-1237/Region: nucleotide-binding motif A (P-loop)
F;1312-1317/Region: nucleotide-binding motif B
F;1316-1319/Region: DEXH motif
F;1616-1862/Product: nonstructural protein NS4a #status predicted <N4A>
F;1863-2013/Product: nonstructural protein NS4b #status predicted <N4B>
F;2014-3010/Product: nonstructural protein NS5 #status predicted <NS5>
F;196,209,234,250,305,325,417,423,430,448,532,540,556,576,623,645,1213,1255,2041,2077,2240,2529,2788/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 2.3%; Score 115.5; DB 1; Length 3010; 

Best Local Similarity 19.7%; Pred. No. 1.7; 

Matches 163; Conservative 83; Mismatches 285; Indels 295; Gaps 40; 

Qy 128 SAARDRTDMHRTWRETFLDNLRAAGLCVDQ QDVQDGNTTVHYALLSASWAVLCYY 182 

II II II III III I : : I I I I I 

Db 314 SGHRMAWDMMMNWSPT TALVVSQLLRIPQAWDMVAGAHWGVL AGLAYY 362 

Qy 183 AEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLG 242

: I I I : I : I II 
Db 363 SMAGNWAKVLIVML LFAG 380 

Qy 243 SDNQDTFFT STKRHQILF EILAKTPYGHEKKNLLGIHQLLAEGVLS 288 

Db 381 VDG-DTHVTGGAQAKTTNRLVSMFASGPSQKIQLINTNGSWHINRTALNCNDSLQTGFLA 439 

Qy 289 AAFPLHDGPFKTPPE GP QAPRLNQRQVLFQH 319 

III II II : : I : I I : : 

Db 440 ALFYTHSFNSSGCPERMAQCRTIDKFDQGWGPITYAESSRSDQRPYCWHYPPPQCTIVPA 499

Qy 320 WARWGK WNKYQPLDHVRRYFGEKVALY 346 

III: 1:1 : I I 
Db 500 SEVCGPVYCFTPSPWVGTTDRFGVPTYRWGENETDVLLLNNTRPPQ — GNWFG 551 

Qy 347 FAWL GFYTGWLLPAAWG TLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCL 398 

I : II I : I II II : I III : I I : 

Db 552 CTWMNSTGFTKTCGGPPCNIGGVGNNTLTCPTDCFRKHPE-ATYTKCGSGP — WLTPRCM 608 

Qy 399 -DCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDY 457 

II: I I : : I I I : : I | : | : | : | | | : 

Db 609 VDYPY RLWHYPCTVNFTIFKVRMYVGGVEH — RLNA — ACNWTRGER 651 

Qy 458 EDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVS 517 

11:111:: I : II | : : | | : : : | | 

Db 652 CDLEDRDRPELSPLLLSTTEWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGIGSAWS 711 

Qy 518 IIL YRAIMAIWSRSGNTLLAAWAS-RIASLTGSWNLVFILILSKIYVSLAHVLTR 573 

: I : : : : : : I II : : I I I : I : I : I I 
Db 712 FAIKWEYVLLLFLLLA-DARVCACLWMMLLIAQAEAALENLV VLNSASVAGAH 763 

Qy 574 WEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGV 626

Db 764 GILSFLVFFCAAWYI KGRLV — PGATYALYGVWPLLLLL 800 

Qy 627 RNEECAAGGCLIELAQELLVIMVGKQV — INNMQEVLIPKLKGWWQKFR 673 

Db 801 LALPPRAYAMDREMAASCGG AVFVGLVLLTLSPYYKVFLARLIWWLQYFT 850

Qy 674 LRSKKRKAGASAGASQGPW EDDYELVPC EGLFDEYLEMVLQFGFVTI 720

I II I I I : I I : I I : : I : : 

Db 851 TR AEADLHVWIPPLNARGGRDAIILLMCAVHPELIFDITKLLIAILGPLMV 901 

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLT H 774

I II : I I I I I : : I I I I | 

Db 902 LQAGITRVPYF VRAQGLIHACMLVRKVA-GGHYVQMAFMKLGALTGTYIYNH 952 

Qy 775 LAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFA 820

I : : III I : : I I I : 
Db 953 LTPLRD WPRA GLRDLAVAVEPWFS 977



RESULT 6 
T35404 

probable squalene-hopene cyclase - Streptomyces coelicolor 
C; Species: Streptomyces coelicolor 

C;Date: 05-Nov-1999 #sequence_revision 05-Nov-1999 #text_change 09-Jul-2004 
C;Accession: T35404 

R;Oliver, K.; Harris, D.; Bentley, S.D.; Parkhill, J.; Barrell, B.G.; Rajandream, M.A.

submitted to the EMBL Data Library, March 1999 
A;Reference number: Z21577 
A;Accession: T35404 

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-680 <OLI>
A;Cross-references: UNIPROT:Q9X7V9; UNIPARC:UPI00000DAF47; EMBL:AL049485; PIDN:CAB39697.1; GSPDB:GN00070; SCOEDB:SC6A5.13
A;Experimental source: strain A3(2)
C;Genetics:
A;Gene: SCOEDB:SC6A5.13
C;Superfamily: squalene-hopene cyclase

Query Match 2.2%; Score 110.5; DB 2; Length 680; 

Best Local Similarity 22.8%; Pred. No. 0.61; 

Matches 112; Conservative 52; Mismatches 173; Indels 155; Gaps 27; 

Qy 40 MTSETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAAC 99 

I I : I : I I I I I I : I I : I : I : I I I 

Db 1 MTA-TTDGSTGASLRPLAASASDTDITI PAAAAGVPEAAA- 39 



Qy 100 RAGSPAKPRIADFVLV WEEDLKLDRQQDSAARDRTDMHRTWRETFL DN 147 

I I I I : I I : I I : : II: II : 

Db 40 RATRRATDFLLAKQDAEGWWKGDL ETNVTMDAEDL LLRQFLGIQDEET 87 



Qy 148 LRAAGLCVDQQDVQDGNTTVHY-ALLSASWAVLCYYAEDLRLKLPLQELPN — QASNW — 202 

I I I I : : : I I I I : I I I I I | | : : | : | 

Db 88 TRAAALFIRGEQREDGTWATFYGGPGELSTTIEAYVA— LRLAGDSPEAPHMARAAEWIR 145 



Qy 203 SAGLLA WLGIPN-VLLEWPDVPPE — YYSCRFRVNKLPRFLGSDNQDTFFT 251 

I I : I II: : : I : : I I I I : I : : : I I 

Db 146 SRGGIASARVFTRIWLALFGWWKWDDLPELPPELIYF PTWVPLNIYD— FG 194 



Qy 252 STKRHQI — LFE ILAKTPYGHEKKNLLGIHQLLAEGVLSAAFP LHDGPFKTPPEGPQ 306 

111:111 I I I I 111:11 

Db 195 CWARQTIVPLTIVSAKRP VRPAPFPLDELHTDPARPNPPRPL 235 



Qy 307 AP RLNQ RQVLFQHWARW GKWNKYQPLDHVR 336 

II I : : : I : III I I I I 
Db 237 APVASWDGAFQRIDKALHAYRKVAPRRLRRAAMNSAARWIIERQENDGCWGGIQP 291 



Qy 337 RYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPL 396 
Db 292 PAVYSVIALYLLGYDLEHPVMRAGLESLDRFAVWRE DGARMIEA 335 



Qy 397 CLDCPFW-LLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATL AY 450 

I II : I II I I II I I : I : : II I I : 

Db 336 C-QSPVWDTCLATIALADAGVPEDHPQLVKASDWMLGEQIVRPGDWSVKRPGLPPGGWAF 394 



Qy 451 RWDCSDYEDTEE 462 

: : I I : : 
Db 395 EFHNDNYPDIDD 406 
RESULT 7
S46584 

probable membrane protein YJL094c - yeast ( Saccharomyces cerevisiae) 
N;Alternate names: hypothetical protein J0909 
C; Species: Saccharomyces cerevisiae 

C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 09-Jul-2004 
C;Accession: S46584; S56871; S47057 
R;Miosga, T.; Witzel, A.; Zimmermann, F.K. 
Yeast 10, 965-973, 1994 

A;Title: Sequence and function analysis of a 9.46 kb fragment of Saccharomyces cerevisiae chromosome
X.
A;Reference number: S46584; MUID:95076716; PMID:7985424
A;Accession: S46584
A;Molecule type: DNA
A;Residues: 1-873 <MIO>
A;Cross-references: UNIPROT:P40309; UNIPARC:UPI000013B5C4; EMBL:X77087; NID:g521093; PIDN:CAA54359.1; PID:g521094

A;Note: the authors translated the codon TCC for residue 645 as Trp 

R;Miosga, T.; Schaaff-Gerstenschlaeger, I.; Baur, A.; Boles, E.; Chalwatzis, N.; Fournier, C.;
Schmitt, S.; Velten, C.; Wilhelm, N.; Witzel, A.; Zimmermann, F.K.
submitted to the Protein Sequence Database, September 1995
A;Reference number: S56855
A;Accession: S56871
A;Molecule type: DNA
A;Residues: 1-873 <MIW>
A;Cross-references: UNIPARC:UPI000013B5C4; EMBL:Z49369; NID:g1008267; PID:g1008268; MIPS:YJL094c
C;Genetics:
A;Gene: SGD:KHA1
A;Cross-references: SGD:S0003630
A;Map position: 10L
C;Keywords: transmembrane protein

Query Match 2.2%; Score 110.5; DB 2; Length 873; 

Best Local Similarity 21.4%; Pred. No. 0.86; 

Matches 69; Conservative 56; Mismatches 106; Indels 91; Gaps 16; 

Qy 503 VIWMVAVWMCLVSIILYRAI— MAIWSRSGNTLLAA W ASRIASL 547 

I : I : I I : : | | : : : : | : | : | | | I : : : | 

Db 157 VFMVFIAVSISVTAFPVLCRILNELRLIKDRAGIWLAAGIINDIMGWILLALSIILSSA 216 



Qy 548 TGSWNLVFILILS KIYVSLAHVLTRWEMHRT QTKFEDAFTLKVFIFQFVNF 599 

11111:11::: II I I I : I I : : I I : : I : : 

Db 217 EGSPVNTVYILLITFAWFLIYFFPLKYLLRWVLIRTHELDRSKPSPLATMCILFIMFISA 276 



Qy 600 YSS PVYIAFFKGRFVGYPGNYHTLFGVRNEEC AAGGCLIELAQ- 642 

Db 277 YFTDIIGVHPIFGAFIAGLVVPRDDHYWKLTERMEDIPNIVFIPIYFAVAGLNVDLTLL 336 



Qy 643 ELLVIMVGKQVINNMQEVLIPKLKG-WWQKFRLRSKKRKAGASAGASQGP 691 

Db 337 NEGRDWGYVFATIGIAIFTKIISG TLTAKLTGLFWRE ATAAGV 379 



Qy 692 WEDDYELVPCEGLFD-EYLEMVLQFGFVT IFVAACPLAPLFALLNNWVEIRLDAR 745 

I : I : I : : I : I I : : : I I I I I : : : I | 

Db 380 LMSCKGIVEIWLTVGLNAGIISRKIFGMFV LMALVSTFVTTPLTQL 426 



Qy 746 KFVCEY RRPVAERAQDIG 763 

Db 427 VYPDSYRDGVRKSLSTPAEDDG 448 



RESULT 8 
T00487

probable potassium transport protein F19I3.29 - Arabidopsis thaliana 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 12-Feb-1999 #sequence_revision 12-Feb-1999 #text_change 09-Jul-2004 
C;Accession: T00487; B84764 

R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes, S.M.; Kaul, S.; Mason,
T.M.; Kerlavage, A.R.; Adams, M.D.; Somerville, C.R.; Venter, J.C. 

submitted to the EMBL Data Library, April 1998 

A;Description: Arabidopsis thaliana chromosome II BAC F19I3 genomic sequence.
A;Reference number: Z14160
A;Accession: T00487
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-792 <ROU>
A;Cross-references: UNIPROT:O64769; UNIPARC:UPI00000485F4; EMBL:AC004238; NID:g3033373; PID:g3033401
A;Experimental source: cultivar Columbia

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.; Fujii, C.Y.; Mason, T.M.;
Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell, C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.
M.; Koo, H.; Moffat, K.S.; Cronin, L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.
E.; Adams, M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver, G.P.;
Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser, C.M.; Venter, J.C.
Nature 402, 761-768, 1999 

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.
A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: B84764
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-792 <STO>
A;Cross-references: UNIPARC:UPI00000485F4; GB:AE002093; NID:g3033401; PIDN:AAC12845.1; GSPDB:GN00139
C;Genetics:
A;Gene: At2g35060; F19I3.29
A;Map position: 2
A;Introns: 50/3; 126/1; 208/1; 225/1; 312/1; 368/1; 627/2
C;Superfamily: barley probable potassium transport protein HAK1

Query Match 2.2%; Score 110; DB 2; Length 792; 

Best Local Similarity 18.5%; Pred. No. 0.83; 

Matches 122; Conservative 94; Mismatches 194; Indels 250; Gaps 28; 

Qy 60 AQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEED 119 

I : I : I : : : I : : : I I I : I : I 

Db 3 ARVEAATMGGEIDEEESDERGS MWDLD 29 

Qy 120 LKLDRQQDSAARDRTDMHRTWRETFL DNLRAAGLCV 155 

Db 30 QKLDQSMDEEAGRLRNMYREKKFSALLLLQLSFQSLGWYGDLGTSPLYVFYNTFPHGIK 89 

Qy 156 DQQDVQDGNTTVHYAL LSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLG 211 

I : I : : : I : I I I : I I I I II 
Db 90 DPEDIIGALSLIIYSLTLIPLLKYVFWC-KAND NGQGGTFA 130 

Qy 212 IPNVLLEWPDVPPEYYS — CRF-RVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPY 268 

Db 131 LYSLLCRHAKVKTIQNQHRTDEELTTYSRTTFHEHSF — AAKTKR 173 

Qy 269 GHEKKN LLGIHQLLAEGVLSAAFPLHD — GPFKTPPEGPQAPRLNQRQVL 316 

Db 174 WLEKRTSRKTALLILVLVGTCMVIGDGILTPAISVLSAAGGLRV NLPHISNGVW 228 

Qy 317 F QHWARWGKWNKYQPLDHVRRYFGEKVALYF AWLGFYTGWLLPAA 361 

I II: 11111:11:111 : 

Db 229 FVAWILVSLFSVQHYG TDRVGWLFAPIVFLWFLSIASIGMYNIWKHDTS 278 
Qy 362 W GTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLC 397

I : I : : : I : I : I : : : : | : 
Db 279 VLKAFSPVYIYRYFKRGGRDRWTSLGGIMLSITGIEALFADLSHFPVSAVQIAFTV 334 

Qy 398 LDCPFWLLSSACALAQAGRLFDH GGTVFFSLFMALWAVLLLEYWKRKSATL 448 

: I I I : : I III I : I : : : I : | : : | | | 

Db 335 IVFPCLLLAYSGQAAYIRRYPDHVADAFYRSIPGSVYWPMFIIATAAAIVASQATISATF 394 

Qy 449 AYRWDCSDYEDTEERPRPQFAASA PMTAPN PITGEDEPYF 488 

Db 395 SLVKQALAHGCF PRVKWHTSRKFLGQIYVPDINWILMILCIAVTAG F 442 

Qy 489 PERSRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLT 548 

Db 443 KNQSQIGNAYGTAWIVMLVTTLLMTLIMILVWRCHm^LV LI 484 

Qy 549 GSWNLV FILILSKI YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQ 595

: I : : I I I : I I I : : I : I I I I I I : I : 

Db 485 FTVLSLWECTYFSAMLFKIDQGGWVPLVIAAAFLLIMWVWHYG TLKRYEFE 536 



RESULT 9 
A45573 

genome polyprotein - hepatitis C virus (strain JT) 

N;Contains: capsid protein C; envelope protein M; hepacivirin (EC 3.4.21.98) (nonstructural protein
NS3); major envelope protein E; nonstructural protein NS1; nonstructural protein NS2; nonstructural
protein NS4a; nonstructural protein NS4b; nonstructural protein NS5
C; Species: hepatitis C virus 

C;Date: 19-May-2000 #sequence_revision 19-May-2000 #text_change 09-Jul-2004 
C;Accession: A45573 

R;Tanaka, T.; Kato, N.; Nakagawa, M.; Ootsuyama, Y.; Cho, M.J.; Nakazawa, T.; Hijikata, M.;
Ishimura, Y.; Shimotohno, K. 
Virus Res. 23, 39-53, 1992 

A;Title: Molecular cloning of hepatitis C virus genome from a single Japanese carrier: sequence
variation within the same individual and among infected individuals.
A;Reference number: A45573; MUID:92295714; PMID:1318627
A;Accession: A45573
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-3010 <TAN>
A;Cross-references: UNIPROT:Q00269; UNIPARC:UPI0000131E29; GB:D11168; GB:D01171; NID:g221612; PIDN:BAA01943.1; PID:g221613
A;Experimental source: HCV-JT
A;Note: sequence extracted from NCBI backbone (NCBIN: 106206, NCBIP: 106207)
C;Superfamily: hepatitis C virus genome polyprotein
C;Keywords: ATP; glycoprotein; hydrolase; nucleotide binding; P-loop; polyprotein; serine
proteinase; transmembrane protein

F;2-115/Product: capsid protein C #status predicted <CPC>
F;116-191/Product: envelope protein M #status predicted <EPM>
F;192-389/Product: major envelope protein E #status predicted <MEE>
F;390-729/Product: nonstructural protein NS1 #status predicted <NS1>
F;730-1006/Product: nonstructural protein NS2 #status predicted <NS2>
F;1007-1615/Product: hepacivirin #status predicted <NS3>
F;1230-1237/Region: nucleotide-binding motif A (P-loop)
F;1312-1317/Region: nucleotide-binding motif B
F;1316-1319/Region: DEXH motif
F;1616-1862/Product: nonstructural protein NS4a #status predicted <N4A>
F;1863-2013/Product: nonstructural protein NS4b #status predicted <N4B>
F;2014-3010/Product: nonstructural protein NS5 #status predicted <NS5>

Query Match 2.2%; Score 108; DB 1; Length 3010; 

Best Local Similarity 20.4%; Pred. No. 7.4; 

Matches 160; Conservative 86; Mismatches 276; Indels 262; Gaps 41; 
Qy 128 SAARDRTDMHRTWRETFLDNLRAAGLCVDQ QDVQDGNTTVHYALLSASWAVLCYY 182

II II II III III I : : I I I I I 

Db 314 SGHRMAWDMMMNWSPT TALWSQLLRIPQAWDMVAGAHWGVL AGLAYY 362 

Qy 183 AEDLRLKLPLQELPNQASNWSAGLLAWL GIPNVLLEWPDVPPEYYSCRFRVNKLPR 239 

: I I : I : I I : I I : 

Db 363 SMVGNWAKVLIVMLLFAGVDGVT YTTG 389

Qy 240 FLGSDNQDT FFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLH 294 

II : I III : I : : : | : | : : | | : | | | 

Db 390 --GSQARHTQSVTSFFTQGPAQRI--QLINTNGSWHINRTALNCNESLNTGFFAALFYAH 445

Qy 295 DGPFKTPPE GP QAPR-LNQRQVLFQHWA 321

II II 111:11 : I : I 

Db 446 KFNSSGCPERMASCSSIDKFAQGWGPITYTEPRDLDQRPYCW-HYAPRQCGIVPASQVCG 504 

Qy 322 RWGK WNKYQPLDHVRRYFGEKVALYFAWL- 350

II 1:1 : II I : 

Db 505 PVYCFTPSPVWGTTDRSGAPTYNWGANETDVLLLNNTRPPQ — GNWFG CTWMN 556 

Qy 351 — GFYTGWLLPAAWG TLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCL-DCPF 402 

II I : I II II : I III : I I : I I : 

Db 557 STGFTKTCGGPPCNIGGVGNLTLTCPTDCFRKHPE-ATYTKCGSGP — WLTPRCIVDYPY 613 

Qy 403 WLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEE 462 

II:: I I I : : I I : I : III I I : II: 

Db 614 RLWHYPCTVNFTIFKVRMYVGGVEH — RLSA — ACNWTRGERCDLED 656 

Qy 463 RPRPQ FAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVS 517 

II: : : II I II | : : | | : : : | | 

Db 657 RDRSELSPLLLSTTEWQTLPCSFT TLPALSTGLIHLHQNIVDVQYLYGIGSAWS 711 

Qy 518 IILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMH 577 

: : : : : I I I I II : I : : : : : | : : : : : 
Db 712 FVIKWEYIVLLF LLLADARVCACLW MMLLIAQAEAALENLW LN 755 

Qy 578 RTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGV 626

I I 1:1:: II I I I I II : I : I I 
Db 756 AASLAGADG ILSFLVFFCAAWYI KGRLV — PGAAYALYGVWPLLLLLLALP 804 

Qy 627 RNEECAAGGCLIELAQELLVIMVG— KQVINNMQEVLIPKLKGWWQKFRLRSK 677 

I : I I 1:11 : : : I : : I I I I I : : 

Db 805 PRAYAMDREMAASCGG WFVGLILLTLSPHYKVFLARLIWWLQYFITRAE 854 

Qy 678 KRKAGASAGASQGPWEDDYELVPC EGLFDEYLEMVLQFGFVTIFVAACPLAPLFAL 733

Db 855 AHLCVWVPPLNVRGGRDAIILLTCAAHPELIFDITKLLLAILGPLMVLQAAITAMPYFVR 914 

Qy 734 LNNWVEIRLDARK FVCEYRRPVAERAQDIGIWFHILAGLT 773 

: : I I : I : : I : I I I I I I I 
Db 915 AQGLIRACMLVRKVAGGHYVQMAFMKLAALTGTYVYDHLTPL QD WAH— AGLR 965 

Qy 774 HLAV 777

I I I 

Db 966 DLAV 969 

RESULT 10 

T11129 

cytochrome-c oxidase (EC 1.9.3.1) chain I - acorn worm mitochondrion 
C;Species: mitochondrion Balanoglossus carnosus
C;Date: 16-Jul-1999 #sequence_revision 16-Jul-1999 #text_change 09-Jul-2004
C;Accession: T11129
R;Castresana, J.; Feldmaier-Fuchs, G.; Yokobori, S.; Satoh, N.; Paabo, S.
Genetics 150, 1115-1123, 1998
A;Title: The mitochondrial genome of the hemichordate Balanoglossus carnosus and the evolution of
deuterostome mitochondria.
A;Reference number: Z17250; MUID:99016090; PMID:9799263
A;Accession: T11129
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-519 <CAS>
A;Cross-references: UNIPROT:O63612; UNIPARC:UPI000008CD49; EMBL:AF051097; NID:g3065580;
PID:g3065692; PIDN:AAD11955.1
C;Genetics:
A;Genome: mitochondrion
C;Superfamily: cytochrome-c oxidase chain I; cytochrome-c oxidase chain I homology
C;Keywords: copper; electron transfer; heme; iron; magnesium; membrane-associated complex;
metalloprotein; mitochondrion; oxidoreductase; respiratory chain
F;11-457/Domain: cytochrome-c oxidase chain I homology <C01>
F;61,378/Binding site: heme a iron (His) (axial ligands) #status predicted
F;240,290,291/Binding site: copper (His) #status predicted
F;240-244/Cross-link: 1'-histidyl-3'-tyrosine (His-Tyr) #status predicted
F;244/Binding site: oxygen (Tyr) #status predicted
F;368/Binding site: magnesium (His) (shared with chain II) #status predicted
F;376/Binding site: heme a3 iron (His) (axial ligand) #status predicted

Query Match 2.2%; Score 106.5; DB 2; Length 519; 

Best Local Similarity 17.8%; Pred. No. 0.93; 

Matches 94; Conservative 51; Mismatches 167; Indels 215; Gaps 18; 

Qy 472 APMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAWVMCLVSIILYRAIMAIWSR 531

I : I I : I : I : I I : I : : : : : | | : : 

Db 39 AELAQPGPLLGDDQIY NVIVTAHAFVMIFFMVMPIMIGG 77

Qy 532 SGNTLL AAW ASRIA 545

I I I I 11:1 
Db 78 FGNWLLPLMLGAPDMAFPRLNNMSFWLLPPSFLLLLSSAGVESGVGTGWTVYPPLAGNMA 137 

Qy 5 46 SLTGSWNLVFILIL SKIYVSLAHVLTRWEMHRTQTKFE— DAFTLKVFIFQFVNFY 600 

III : I I I I I : : : I I : I : I I I I : 

Db 138 HAGGSVDLAIFSLHLAGISSILGAINFMTTVINMRAPGVRFDRLPLFVWSVFITVILLLL 197 

Qy 601 SSPV Y 605 

I I I 

Db 198 SLPVLAGAITMLLTDRNLNTSFFDPAGGGDPILYQHLFWFFGHPEVYILILPAFGMISHV 257 

Qy 606 IAFFKGRF--VGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINN M 657

Db 258 IAFYSGKKEPFGYLGMVYAMIAI GILGFLVWAHHMFTVGMDVDTRAYFTAAT 309

Qy 658 QEVLIP KLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQ 714

: : I I : I I : I I I : : 

Db 310 MVIAVPTGIKIFSWL ATLHGSALQWE APLLWA 341 

Qy 715 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGI WFHI 758 

I I I : I II::::: : | : | II | | : 

Db 342 LGFVFLFTVGGLTG— IVLSNSSLDWMHDTYYWAHFHYVLSMGAVFGIFAGFIHWFPL 399 

Qy 759 LAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCR 828 

I I I I 11:1 I I I I I I 

Db 400 FTGLTLHPV WTKFHFWTMFLGVNLTFFPQHFLGLAGMPRR 439 

Qy 829 YRAFRDDDGHYSQTYWNLLA IRLAFVIVFEHW FSVGRL 867

I : I : I I I : I : : I I I I : I : : I : II 




Db 440 YSDYPD AYTTWNVLSSVGSIVSLASVIIFLAILWEAFTARRL 481 



RESULT 11 
B86088 

probable citrate permease Z5523 [imported] - Escherichia coli (strain O157:H7, substrain EDL933)
C;Species: Escherichia coli
C;Date: 16-Feb-2001 #sequence_revision 16-Feb-2001 #text_change 09-Jul-2004
C;Accession: B86088
R;Perna, N.T.; Plunkett III, G.; Burland, V.; Mau, B.; Glasner, J.D.; Rose, D.J.; Mayhew, G.F.;
Evans, P.S.; Gregor, J.; Kirkpatrick, H.A.; Posfai, G.; Hackett, J.; Klink, S.; Boutin, A.; Shao,
Y.; Miller, L.; Grotbeck, E.J.; Davis, N.W.; Lim, A.; Dimalanta, E.; Potamousis, K.; Apodaca, J.;
Anantharaman, T.S.; Lin, J.; Yen, G.; Schwartz, D.C.; Welch, R.A.; Blattner, F.R.
Nature 409, 529-533, 2001
A;Title: Genome sequence of enterohemorrhagic Escherichia coli O157:H7.
A;Reference number: A85480; MUID:21074935; PMID:11206551
A;Accession: B86088
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-438 <STO>
A;Cross-references: UNIPROT:Q8X4P7; UNIPARC:UPI00000D0D9E; GB:AE005174; NID:g12518889;
PIDN:AAG59166.1; GSPDB:GN00145; UWGP:Z5523
A;Experimental source: strain O157:H7, substrain EDL933
C;Genetics:
A;Gene: Z5523
C;Superfamily: citrate utilization determinant

Query Match 2.1%; Score 106; DB 2; Length 438; 

Best Local Similarity 23.3%; Pred. No. 0.82; 

Matches 57; Conservative 31; Mismatches 69; Indels 88; Gaps 13; 

Qy 339 FGEKVAL-YFAWLGFYT GWLLPAAWGTLVFLVGCFLVFS DIP T 381 

II III I I I I I I I : I : I I : I I I : I I I : I 

Db 173 FGGWALGLSAWLPFATGSETVMAEWGWRVP-FFIGVLLAPVGCWLRLSLENDVPEPAHN 231 

Qy 382 QELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTV — FFSLF-MALWAVLLL 438 

: : I : : I : I : : I I I I I I : I I I II I 

Db 232 KKAAASESAFSL LMQHKATIAN-GILLAIGSTVATYISLFYYGTWAAKYL 280 

Qy 439 EYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRML 498 

I : : I II : I 

Db 281 GMNQNY SHAAMLL 293 

Qy 499 AGSWIWMVAVWMC LVSI ILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNL 554 

I I : I : I : : I : I I : I : : I I I : I I : : : 

Db 294 AGVITFVGALLVGMLCDSVGRKKLILISRVMVLICSWPSFWLLVNYPS PGMLLTV 348 

Qy 555 VFILI 559 

II::: 

Db 349 VFVMV 353 



RESULT 12 
E91240 

probable membrane transport / symporter protein ECs4893 [imported] - Escherichia coli (strain O157:H7,
substrain RIMD 0509952)
C;Species: Escherichia coli
C;Date: 18-Jul-2001 #sequence_revision 18-Jul-2001 #text_change 09-Jul-2004
C;Accession: E91240
R;Hayashi, T.; Makino, K.; Ohnishi, M.; Kurokawa, K.; Ishii, K.; Yokoyama, K.; Han, C.G.; Ohtsubo,
E.; Nakayama, K.; Murata, T.; Tanaka, M.; Tobe, T.; Iida, T.; Takami, H.; Honda, T.; Sasakawa, C.;
Ogasawara, N.; Yasunaga, T.; Kuhara, S.; Shiba, T.; Hattori, M.; Shinagawa, H.
DNA Res. 8, 11-22, 2001
A;Title: Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic
comparison with a laboratory strain K-12.
A;Reference number: A99629; MUID:21156231; PMID:11258796
A;Accession: E91240
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-438 <HAY>
A;Cross-references: UNIPROT:Q8X4P7; UNIPARC:UPI00000D0D9E; GB:BA000007; PIDN:BAB38316.1;
PID:g13364369; GSPDB:GN00154
A;Experimental source: strain O157:H7, substrain RIMD 0509952
C;Genetics:
A;Gene: ECs4893
C;Superfamily: citrate utilization determinant

Query Match 2.1%; Score 106; DB 2; Length 438; 

Best Local Similarity 23.3%; Pred. No. 0.82;

Matches 57; Conservative 31; Mismatches 69; Indels 88; Gaps 13; 

Qy 339 FGEKVAL-YFAWLGFYT GWLLPAAWGTLVFLVGCFLVFS DIP T 381

Db 173 FGGWALGLSAWLPFATGSETVMAEWGWRVP-FFIGVLLAPVGCWLRLSLENDVPEPAHN 231



Qy 382 QELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTV— FFSLF-MALWAVLLL 438 

: : I : : I : I : : I I I I I I : I I I II I 

Db 232 KKAAASESAFSL LMQHKATIAN-GILLAIGSTVATYISLFYYGTWAAKYL 280 



Qy 439 EYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRML 498

I :: I II : I 

Db 281 GMNQNY SHAAMLL 293



Qy 499 AGSWIWMVAWVMC LVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNL 554 

I I : I : I : : I : I I : I : : I I I : I I : : : 

Db 294 AGVITFVGALLVGMLCDSVGRKKLILISRVMVLICSWPSFWLLVNYPS PGMLLTV 348 



Qy 555 VFILI 559 

II::: 

Db 349 VFVMV 353 



RESULT 13 
JC1346 

dopamine beta-monooxygenase (EC 1.14.17.1) precursor - mouse 

C;Species: Mus musculus (house mouse)
C;Date: 30-Sep-1993 #sequence_revision 30-Sep-1993 #text_change 09-Jul-2004
C;Accession: JC1346
R;Nakano, T.; Kobayashi, K.; Saito, S.; Fujita, K.; Nagatsu, T.
Biochem. Biophys. Res. Commun. 189, 590-599, 1992
A;Title: Mouse dopamine beta-hydroxylase: primary structure deduced from the cDNA sequence and
exon/intron organization of the gene.
A;Reference number: JC1346; MUID:93080618; PMID:1280432
A;Accession: JC1346
A;Molecule type: mRNA
A;Residues: 1-621 <NAK>
A;Cross-references: UNIPROT:Q64237; UNIPARC:UPI0000029950; GB:S50200; NID:g250872; PIDN:AAB24330.1;
PID:g260873
C;Comment: This enzyme catalyzes the hydroxylation of dopamine to norepinephrine.
C;Genetics:
A;Introns: 117/3; 166/3; 252/3; 346/1; 401/3; 449/3; 462/3; 482/3; 525/2; 578/3
C;Keywords: catecholamine biosynthesis; copper; glycoprotein; monooxygenase; oxidoreductase;
phosphoprotein
F;1-43/Domain: (or 1-46) signal sequence #status predicted <SIG>
F;44-521/Product: (or 47-621) dopamine beta-monooxygenase #status predicted <MAT>
F;300-523/Domain: peptidylglycine monooxygenase I homology <PGM>
F;58,188,476,570/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;350,528/Binding site: phosphate (Ser) (covalent) (by calmodulin-dependent kinase II) #status
predicted

Query Match 2.1%; Score 105; DB 2; Length 621; 

Best Local Similarity 20.2%; Pred. No. 1.6; 

Matches 113; Conservative 61; Mismatches 187; Indels 198; Gaps 29; 



Qy 110 ADFVLVWEE DLKLDRQQD SAARDRTDMHRTWRETFLDNLRA 150 

I I : : : I : : I I I I I III: : : I 
Db 102 ADLIMLWSDGDRAYFADAWSDRKGQIHLDSQQDYQLLQAQRTRDGLSLLFKRPF 155 



Qy 151 AGLCVDQQDVQDGNTTVH — YALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLA 208 
Db 156 — VTCDPKDYVIEDDTVHLVYGILEE PFQSL — EAINTS 190 



Qy 209 WLGIPNVLLEV VPDVPPEYYSCRFRVNKLPRFLGSDNQDTFF TS 252 

I : III I : I : : | | | | | : | : : 
Db 191 — GLHTGLLRVQLLKSEVPTPSMPEDVQTMDIRA PDILIPDNEQTYWCYITELPPRF 245 



Qy 253 TKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGP — FKTPPEGPQAPRL 310 

: I I : : I : I : : : | || I I I : I I II: II 
Db 246 PRHHIIMYEAIV-TEGNEALVHHMEVFQCAAE SEDFPQFNGPCDSKMKPD RL 296



Qy 311 NQRQVLFQHWARWGKWNKYQPLD HVRRYFGEK VAL 345 

I : : I I I I I : : I : I : : | 

Db 297 NYCRHVLAAWALGAK-AFYYPKEAGVPFGGPGSSPFLRLEVHYHNPRKIQGRQDSSGIRL 355 



Qy 346 -YFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLD-CPFW 403 

I I I I : : : I I : I I I I : I : III 

Db 356 PYTATLRRYDAGIMELGLVYTPLMA IPPQE TAFVLTGYCTDKCTQM 401 

Qy 404 LLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEER 463 

I : I : I I I I I : I I : I I 

Db 402 ALQDSGIHIFASQLHTH LTGRKWTVLAR DGQER 435 



Qy 464 PRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAWVMCLVSIILYRA 523 

III I I I I I : : : I 
Db 436 KE VNRDNHYSP-HFREIRMLKKWTVYPGDVLITSC 470 



Qy 524 IMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKF 583 

: II I : : I I I : : I : I : | : : : 

Db 471 TYNTENKTL ATVGG FGILEEMCVNYVHYYPQTELELCKSAV 511



Qy 584 EDAFTLKVFIFQFVNFYSS 602 

: I I I I I I : I I 
Db 512 DDGFLQK — YFHMVNRFSS 528 



RESULT 14 
H82555 

c-type cytochrome biogenesis membrane protein XF2460 [imported] - Xylella fastidiosa (strain 9a5c) 
C;Species: Xylella fastidiosa 

C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 09-Jul-2004 
C;Accession: H82555 

R;anonymous. The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and
Analysis, Sao Paulo, Brazil.
Nature 406, 151-157, 2000
A;Title: The genome sequence of the plant pathogen Xylella fastidiosa.
A;Reference number: A82515; MUID:20365717; PMID:10910347
A;Note: for a complete list of authors see reference number A59328 below
A;Accession: H82555
A;Status: preliminary






A;Molecule type: DNA
A;Residues: 1-646 <SIM>
A;Cross-references: UNIPROT:Q9PAN5; UNIPARC:UPI00000C2A60; GB:AE004054; GB:AE003849; NID:g9107645;
PIDN:AAF85259.1; GSPDB:GN00128; XFSC:XF2460
A;Experimental source: strain 9a5c
R;Simpson, A.J.G.; Reinach, F.C.; Arruda, P.; Abreu, F.A.; Acencio, M.; Alvarenga, R.; Alves, L.M.C.;
Araya, J.E.; Baia, G.S.; Baptista, C.S.; Barros, M.H.; Bonaccorsi, E.D.; Bordin, S.; Bove, J.M.;
Briones, M.R.S.; Bueno, M.R.P.; Camargo, A.A.; Camargo, L.E.A.; Carraro, D.M.; Carrer, H.; Colauto,
N.B.; Colombo, C.; Costa, F.F.; Costa, M.C.R.; Costa-Neto, C.M.; Coutinho, L.L.; Cristofani, M.;
Dias-Neto, E.; Docena, C.; El-Dorry, H.; Facincani, A.P.; Ferreira, A.J.S.
submitted to GenBank, June 2000
A;Authors: Ferreira, V.C.A.; Ferro, J.A.; Fraga, J.S.; Franca, S.C.; Franco, M.C.; Frohme, M.;
Furlan, L.R.; Garnier, M.; Goldman, G.H.; Goldman, M.H.S.; Gomes, S.L.; Gruber, A.; Ho, P.L.;
Hoheisel, J.D.; Junqueira, M.L.; Kemper, E.L.; Kitajima, J.P.; Krieger, J.E.; Kuramae, E.E.;
Laigret, F.; Lambais, M.R.; Leite, L.C.C.; Lemos, E.G.M.; Lemos, M.V.F.; Lopes, S.A.; Lopes, C.R.;
Machado, J.A.; Machado, M.A.; Madeira, A.M.B.N.; Madeira, H.M.F.; Marino, C.L.; Marques, M.V.;
Martins, E.A.L.
A;Authors: Martins, E.M.F.; Matsukuma, A.Y.; Menck, C.F.M.; Miracca, E.C.; Miyaki, C.Y.;
Monteiro-Vitorello, C.B.; Moon, D.H.; Nagai, M.A.; Nascimento, A.L.T.O.; Netto, L.E.S.; Nhani Jr., A.;
Nobrega, E.G.; Nunes, L.R.; Oliveira, M.A.; de Oliveira, M.C.; de Oliveira, R.C.; Palmieri, D.A.;
Paris, A.; Peixoto, B.R.; Pereira, G.A.G.; Pereira Jr., H.A.; Pesquero, J.B.; Quaggio, R.B.;
Roberto, P.G.; Rodrigues, V.; Rosa, A.J. de M.; de Rosa Jr., V.E.; de Sa, R.G.; Santelli, R.V.;
Sawasaki, H.E.
A;Authors: da Silva, A.C.R.; da Silva, F.R.; da Silva, A.M.; Silva Jr., W.A.; da Silveira, J.F.;
Silvestri, M.L.Z.; Siqueira, W.J.; de Souza, A.A.; de Souza, A.P.; Terenzi, M.F.; Truffi, D.; Tsai,
S.M.; Tsuhako, M.H.; Vallada, H.; Van Sluys, M.A.; Verjovski-Almeida, S.; Vettore, A.L.; Zago, M.A.;
Zatz, M.; Meidanis, J.; Setubal, J.C.
A;Reference number: A59328
A;Contents: annotation
C;Genetics:
A;Gene: XF2460
C;Superfamily: nrfE protein

Query Match 2.1%; Score 104; DB 2; Length 646; 

Best Local Similarity 18.9%; Pred. No. 2; 

Matches 95; Conservative 46; Mismatches 152; Indels 210; Gaps 19; 

Qy 167 VHYALLSASWAVLC YYAEDLRLKLPLQELPNQASNWSA GLLAWLGI 212 

I : I I : : : I : I III: II I : I I III 
Db 45 VQLSLLAGAFALLTYAFLGNDFSVQYVAENSHSLLP — TLYRSTAVWGAHEGSLLLW 99 

Qy 213 PNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK 272

III : I I : I I I : 
Db 100 — VLL LAGWTASVALRSHTLPATLSA 123 

Qy 273 KNLLGIHQLLAEGVLSAAFPLHDGPF KTPPEGPQAPRLNQRQVLFQH 319

Db 124 -RILGVLGLIALGFL-ALILFTSNPFARLLPAVPEGNDLNPLLQDPGMIVHPPLLYAGYI 181 

Qy 320 WARW GKWNKYQPLDHVRRYFG 340 

Db 182 GFAVPFAFAVAVLLEGRIDPTWLRWSRPWTHTAWALLTLGIALGSWWAYYELGWGGWWFW 241 

Qy 341 EKV — ALYFAWL GFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQE 383 

: I I : I I I : I I I I : : I : I I I I I : I 

Db 242 DPVENASFMPWLIGVALIHSQAITDKRGSFTHWTLLLAITAFALALLGTFLVRSSVLT— 299 

Qy 384 LCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKR 443

I : I I : I : : I : I I I 
Db 300 SVHAFAADPV RGAFILLLIFTLIGGALLL 328 

Qy 444 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSW 503

: I II 1:1 : I II : I : : : 




Db 329 YARRAPQL — TPVTINMQQRFTPVSRETLLLLNNLL 362 

Qy 504 IWMVAVWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIA SLTGSWNLVFILI 559 

: I : I : : II I : I I : : I I : 

Db 363 LTCACAMVLL GTLYPLLADALALGQLSVGPPYFGPLFTLL 402 

Qy 560 LSKIYVSL-AHVLTRWEMHRTQT 581 

: : : I I III: I 

Db 403 MTPLIVLLPLGPFTRWQREHPST 425 



RESULT 15 
JQ2034 

RNA-directed RNA polymerase (EC 2.7.7.48) - beet cryptic virus 3 
C;Species: beet cryptic virus 3
C;Date: 03-May-1994 #sequence_revision 03-May-1994 #text_change 09-Jul-2004
C;Accession: JQ2034
R;Xie, W.S.; Antoniw, J.F.; White, R.F.
J. Gen. Virol. 74, 1467-1470, 1993
A;Title: Nucleotide sequence of beet cryptic virus 3 dsRNA2 which encodes a putative RNA-dependent
RNA polymerase.
A;Reference number: JQ2034; MUID:93329401; PMID:8336129
A;Accession: JQ2034
A;Molecule type: mRNA
A;Residues: 1-478 <XIE>
A;Cross-references: UNIPROT:Q86632; UNIPARC:UPI00000EE5F9; GB:S63913; NID:g407557; PIDN:AAB27624.1;
PID:g407558
C;Genetics:
A;Map position: segment RNA2
C;Keywords: nucleotidyltransferase; reverse transcriptase

Query Match 2.1%; Score 103.5; DB 2; Length 478; 

Best Local Similarity 20.7%; Pred. No. 1.5; 

Matches 53; Conservative 34; Mismatches 96; Indels 73; Gaps 11; 

Qy 59 RAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEE 118 

: I : I : I I I I I : I III: Mill: : II 

Db 109 KARAFDVNTELDKVPYEQSSSAGYGYRSHKGPPGGE— THMRAISRVKPTLMTAIRPDEE 166 

Qy 119 DLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTV 167 

I : I I : I : : I : I I 

Db 167 GPEYTILESVPDIGYTRTQLADLREKTKVRGVWGRAF 203 

Qy 168 HYALLSASWA VLCYYAEDLRLKLP — LQELPNQASNWSAGLLAWLGIPN 214 

Db 204 HYILIEGTAARPLLENFMLGTTFMHIGSDPQLSVPRILHQMKREGSKWLYA-LDWSSFDS 262 

Qy 215 VLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK — 272 

I : I I II : I : : I : : I I I : : : | : | 

Db 263 SVTRFEINCAF— NLLKERIEFPNEET ELAFE-LSRILFKHKKLA 304 

Qy 273 KNLLGIHQLLAEG 285 

Db 305 APDGNIYMIHKGIPSG 320 



Search completed: June 24, 2008, 08:37:14 
Job time : 37 secs